Cherry Blossom: A System for Japanese Character Recognition
Abstract
A general-purpose Japanese character recognition system, Cherry Blossom, has been developed at CEDAR in past years. It is designed to recognize Japanese document images that have low resolution or poor print quality. The system includes modules for page skew correction, document segmentation, text segmentation, character recognition and postprocessing. API code has been developed for each module so that it can be tested as a standalone program or called from larger application systems such as a document indexing and retrieval system. In the character recognition module, two classification methods, the nearest-neighbor classifier and the subspace method, have been integrated in an efficient way. Character recognition speed has risen from the original one character per second to about six characters per second. New techniques, including dynamic feature selection, incremental nearest prototype search and visual similarity analysis, have been developed to speed up character classification while keeping accuracy high. A linguistic postprocessing module improves the accuracy of character recognition by exploiting linguistic context. It was trained on a Japanese text corpus with above 70 mil-
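The abstract names incremental nearest prototype search as one of the speed-up techniques but gives no details. As a rough illustration only, the sketch below shows partial-distance search with early abandonment, a common way to accelerate nearest-prototype classification. It is not claimed to be the Cherry Blossom method; the function name `nearest_prototype`, the toy feature vectors and the two character labels are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def nearest_prototype(
    features: Sequence[float],
    prototypes: Dict[str, List[Sequence[float]]],
) -> Tuple[str, float]:
    """Return (label, squared distance) of the closest prototype.

    The squared Euclidean distance is accumulated one feature at a time.
    A prototype is abandoned as soon as its partial sum reaches the best
    distance found so far, since it can no longer win.
    """
    best_label = ""
    best_dist = float("inf")
    for label, vectors in prototypes.items():
        for proto in vectors:
            partial = 0.0
            for x, p in zip(features, proto):
                partial += (x - p) ** 2
                if partial >= best_dist:
                    break  # early abandonment: cannot beat the current best
            else:
                # The loop finished without abandoning, so this is a new best.
                best_label, best_dist = label, partial
    return best_label, best_dist


# Hypothetical toy prototypes; real feature vectors would be far longer.
toy_prototypes = {
    "日": [[0.0, 0.0, 1.0], [0.1, 0.0, 0.9]],
    "目": [[1.0, 1.0, 0.0]],
}
label, dist = nearest_prototype([0.1, 0.1, 0.9], toy_prototypes)
print(label, dist)
```

Early abandonment returns the same answer as an exhaustive search; it only skips arithmetic for prototypes that are already worse than the current best, so the saving grows with the length of the feature vector.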
Similar resources
Neural Network Based Recognition System Integrating Feature Extraction and Classification for English Handwritten
Handwriting recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. It has numerous applications, including reading aids for the blind, bank cheques and conversion of any handwritten document into structured text form. Neural Network (NN) with its inherent learning ability offers promising solutions for handwritten characte...
Impact of Climate Variability on Flowering Phenology and Its Implications for the Schedule of Blossom Festivals
Many tourism destinations characterized by spring blossom festivals (e.g., cherry blossom festival) became increasingly popular around the world. Usually, spring blossom festivals should be planned within the flowering period of specific ornamental plants. In the context of climate and phenological change, whether the administrators of tourism destinations had perceived and responded to the flo...
An On-line Handwritten Japanese Text Recognition System Free from Line Direction and Character Orientation Constraints
This paper describes an on-line handwritten Japanese text recognition system that is liberated from constraints on line direction and character orientation. The recognition system first separates freely written text into text line elements, second estimates the line direction and character orientation using the time sequence information of pen-tip coordinates, third hypothetically segment it in...
Survey of Pattern Recognition Approaches in Japanese Character Recognition
Optical Character Recognition (OCR) in Japanese, both handwritten and printed, is difficult to perform, owing to several reasons. Firstly, the Japanese language is comprised of over 3000 characters which can be classified as syllabic characters, or Kana, and ideographic characters, called Kanji. Secondly, Japanese text does not have delimiters like spaces, separating different words. Thirdly, s...
A Model of On-line Handwritten Japanese Text Recognition Free from Line Direction and Writing Format Constraints
This paper presents a model and its effect for on-line handwritten Japanese text recognition free from line-direction constraint and writing format constraint such as character writing boxes or ruled lines. The model evaluates the likelihood composed of character segmentation, character recognition, character pattern structure and context. The likelihood of character pattern structure considers...
Publication date: 1997